Non-Cooperative Inverse Reinforcement Learning
Making decisions in the presence of a strategic opponent requires one to take into account the opponent's ability to actively mask its intended objective. To describe such strategic situations, we introduce the non-cooperative inverse reinforcement learning (N-CIRL) formalism. The N-CIRL formalism consists of two agents with completely misaligned objectives, where only one of the agents knows the true objective function.
Reviews: Non-Cooperative Inverse Reinforcement Learning
Comments after rebuttal and discussion: Thank you for clarifying my misunderstandings in the rebuttal. I no longer have any major technical concerns, and I have adjusted my score to reflect this. It still strikes me as slightly odd that the proposed algorithm does not make use of any data of play, i.e., it isn't really inverse reinforcement learning.
Original Review: Overall, I enjoyed reading this paper. It is fairly clear and well-written.
Reviews: Non-Cooperative Inverse Reinforcement Learning
The reviewers initially had some concerns about this paper, but the authors addressed these concerns with their response and the sentiment among the reviewers is now clearly positive. I encourage the authors to revise their paper in a way that will, hopefully, clarify things and/or avoid possible reader misunderstandings based upon this review/response cycle.
Non-Cooperative Inverse Reinforcement Learning
Making decisions in the presence of a strategic opponent requires one to take into account the opponent's ability to actively mask its intended objective. To describe such strategic situations, we introduce the non-cooperative inverse reinforcement learning (N-CIRL) formalism. The N-CIRL formalism consists of two agents with completely misaligned objectives, where only one of the agents knows the true objective function. As a result of the one-sided incomplete information, the multi-stage game can be decomposed into a sequence of single-stage games expressed by a recursive formula. Solving this recursive formula yields the value of the N-CIRL game and the more informed player's equilibrium strategy.
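To make the notion of a single-stage game concrete, the sketch below computes the value and the row player's equilibrium mixed strategy for a 2x2 zero-sum matrix game. It is only an illustration of the kind of subproblem such a recursion would solve at each stage, not the paper's algorithm. The function name `solve_2x2_zero_sum` and the payoff matrices are assumptions made for this example, and the handling of beliefs over the hidden objective is omitted entirely.

```python
def solve_2x2_zero_sum(payoff):
    """Solve a 2x2 zero-sum matrix game for the row (maximizing) player.

    payoff[i][j] is the amount the column player pays the row player
    when the row player picks row i and the column player picks column j.
    Returns (value, p), where p is the probability of playing row 0.
    """
    (a, b), (c, d) = payoff

    # Pure-strategy security levels of the two players.
    maximin = max(min(a, b), min(c, d))  # best guaranteed payoff for the row player
    minimax = min(max(a, c), max(b, d))  # lowest payoff the column player can enforce

    if maximin == minimax:
        # A saddle point exists, so a pure strategy is optimal.
        p = 1.0 if min(a, b) >= min(c, d) else 0.0
        return float(maximin), p

    # No saddle point: the row player mixes so that the column player is
    # indifferent between columns. The denominator is nonzero in this case.
    denom = a + d - b - c
    p = (d - c) / denom
    value = (a * d - b * c) / denom
    return value, p


# Matching pennies: no pure saddle point, so the row player mixes evenly.
pennies_value, pennies_p = solve_2x2_zero_sum([[1, -1], [-1, 1]])

# A game with a saddle point at row 0, column 1.
saddle_value, saddle_p = solve_2x2_zero_sum([[3, 1], [2, 0]])
```

In a multi-stage setting, the entries of the payoff matrix would themselves depend on the values of later stages, which is what makes the formula recursive. Larger action sets would need a linear program rather than this closed form.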
Non-Cooperative Inverse Reinforcement Learning
Zhang, Xiangyuan, Zhang, Kaiqing, Miehling, Erik, Basar, Tamer